EXPERT REVIEW

Schizophrenia genetics: emerging themes for a complex 
disorder 

DH Kavanagh, KE Tansey, MC O'Donovan and MJ Owen 

After two decades of frustration, genetic studies of schizophrenia have entered an era of spectacular success. Advances in 
genotyping technologies and high throughput sequencing, increasing analytic rigour and collaborative efforts on a global scale 
have generated a profusion of new findings. The broad conclusions from these studies are threefold: (1) schizophrenia is a highly 
polygenic disorder with a complex array of contributing risk loci across the allelic frequency spectrum; (2) many psychiatric illnesses 
share risk genes and alleles; specifically, schizophrenia has substantial overlaps with bipolar disorder, intellectual disability, major 
depressive disorder and autism spectrum disorders; and (3) some convergent biological themes are emerging from studies of 
schizophrenia and related disorders. In this commentary, we focus on the very recent findings that have emerged in the past 
12 months, and in particular, the areas of convergence that are beginning to emerge from multiple study designs. 
INTRODUCTION

It has been known for many years on the basis of a large number 
of family, twin and adoption studies that genetics has an 
important role in conferring risk to schizophrenia. 1 Identification 
of genetic risk at the level of DNA variation has, however, proved 
challenging with replicable findings hard to come by and the field 
has endured cycles of optimism and disappointment. However, in
the past few years this has begun to change. Success has been
built largely on three main developments in addition to the 
publication of the sequence of the human genome and the 
increasingly detailed documentation of population variation in 
that sequence. First, technology has enabled analyses of genetic
variation on a genome-wide scale, both for common alleles 
through genome-wide association studies (GWAS) as well as rare 
mutations through copy number variant (CNV) analysis. More 
recently, the task of seeking rare pathogenic single base and 
insertion-deletion polymorphisms has been facilitated by the 
development of whole exome and whole genome sequencing. 
Second, most geneticists have realised the importance of applying 
rigorous statistical criteria. Finally, with the realisation that 
common risk alleles confer very small individual risks has come 
the appreciation that very large samples are required to satisfy 
stringent statistical thresholds. This has led geneticists, psychiatric 
and otherwise, to collaborate to an extent that hitherto has been 
unusual in the biological sciences (http://www.med.unc.edu/pgc). 
In this paper, we review very recent genomic findings in 
schizophrenia and consider their implications. 

RECENT HISTORY 

Common genetic variation 

The Wellcome Trust Case Control Consortium study of seven 
common diseases, including 2000 bipolar cases, was a landmark in 
genetics. Although it did not identify a definitive association to
bipolar disorder, it clarified the boundaries of expectation
regarding the effect sizes of common risk alleles for a range of 
complex disorders. To many psychiatric geneticists, the implica- 
tions were clear; much larger samples were required to capture 
the small effect sizes (odds ratios < 1.2) typical of common risk
alleles for complex disorders. 

The first wave of successful schizophrenia GWAS, which 
between them identified fewer than five risk loci, echoed this 
conclusion. 2-5 They also showed that schizophrenia was highly
polygenic, and even more so than was generally expected. The 
clearest demonstration of this came from the study of the 
International Schizophrenia Consortium. 5 That group showed that 
an aggregate score representing the number of single-nucleotide 
polymorphisms (SNPs) selected for even weak evidence of 
association in their study was higher in schizophrenia cases than 
controls in independent studies. Modelling suggested that the 
signal from this 'polygenic score' was being driven by hundreds, 
and likely more than a thousand, of individual susceptibility SNPs, 
which together could explain approximately one-third of the 
genetic liability. 5 
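
The polygenic score described above can be sketched in a few lines. This is a minimal illustration, not the consortium's actual pipeline: it assumes SNPs are selected by a liberal P-value threshold in a discovery sample, weighted by the log of their discovery odds ratio, and summed over allele counts in an independent target sample. All SNP names, odds ratios, P values and genotypes below are hypothetical.

```python
import math

def select_snps(discovery, p_threshold):
    """Keep SNPs whose discovery-sample P value is below a liberal threshold."""
    return [snp for snp, (odds_ratio, p) in discovery.items() if p < p_threshold]

def polygenic_score(dosages, log_odds_ratios):
    """Weighted count of scored alleles for one individual."""
    return sum(d * w for d, w in zip(dosages, log_odds_ratios))

# Hypothetical discovery-sample results: SNP -> (odds ratio, P value).
discovery = {
    "snp1": (1.10, 0.001),
    "snp2": (0.93, 0.04),
    "snp3": (1.05, 0.30),
    "snp4": (1.02, 0.70),
}

# Hypothetical target-sample genotypes: counts (0, 1 or 2) of the scored allele.
target = {
    "case_a": {"snp1": 2, "snp2": 0, "snp3": 1, "snp4": 1},
    "control_b": {"snp1": 0, "snp2": 2, "snp3": 1, "snp4": 2},
}

# Even weakly associated SNPs are retained (here, P < 0.5).
selected = select_snps(discovery, p_threshold=0.5)
weights = [math.log(discovery[snp][0]) for snp in selected]

scores = {
    person: polygenic_score([genotypes[snp] for snp in selected], weights)
    for person, genotypes in target.items()
}
```

In a real analysis the scores of cases and controls in the target sample would then be compared, for example by logistic regression; the two individuals here only show how a single score is assembled.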

Over the next few years, studies (refs 2, 6-9 and others reviewed 
in ref 10) from either informal collaborations or more formalised 
consortia detected incrementally more risk loci such that by the 
end of 2013 around 30 loci had been reported at genome-wide 
significance, albeit not always in samples that were well enough 
powered to give confidence in the findings. The largest single 
study prior to 2014 undertook a meta-analysis of new data with 
that from the Psychiatric Genome Wide Association Consortium 
(now known as the Psychiatric Genomics Consortium or PGC) 11;
including replication, the sample comprised more than 21 000
cases and 38 000 controls. In total, 22 nonoverlapping genomic 
loci reached genome-wide significance (P < 5 × 10⁻⁸). These
included 8 previously associated regions and 13 novel associa- 
tions. It should be noted at this point that when we refer to risk 
loci, we refer to regions of the genome that contain one or more
alleles associated with the disorder at a level corresponding to
genome-wide significance. However, because of linkage disequili- 
brium, typically, a region contains many strongly or partially 
correlated alleles, any of which might be the actual pathogenic 
DNA variant. Similarly, when we refer to a risk allele, we refer only
to association; for the same reason, we do not intend to
suggest that it is the variant that directly alters gene function.
Moreover, as multiple correlated SNPs within a locus often span 
multiple genes, and sometimes do not span any known genes, it 
follows that association does not unequivocally implicate a
specific causal gene. Nevertheless, genes involved in a number of 
broad biological themes were found to be enriched within the 
regions of association, particularly calcium signalling and genes 
predicted to be regulated by microRNA MIR137. Interesting as 
those results were, the evidence for any of the biological themes 
was not definitive. 
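
The genome-wide significance threshold of 5 × 10⁻⁸ used above is conventionally motivated as a Bonferroni correction of a 5% family-wise error rate across roughly one million independent common-variant tests. The sketch below makes that arithmetic explicit. The figure of one million tests is an assumption of the illustration, not something stated in the studies reviewed here, and the function names and P values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

# Assumption: about one million effectively independent common-variant tests.
GENOME_WIDE_ALPHA = bonferroni_threshold(0.05, 1_000_000)

def genome_wide_significant(p_value, threshold=GENOME_WIDE_ALPHA):
    """True if an association P value passes the genome-wide threshold."""
    return p_value < threshold

# Hypothetical association results: SNP -> P value.
results = {"snp_a": 3e-9, "snp_b": 2e-7, "snp_c": 0.01}
hits = [snp for snp, p in results.items() if genome_wide_significant(p)]
```

Only `snp_a` passes here; `snp_b` would count as suggestive at best, which is why very large samples are needed before small-effect alleles clear the threshold.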

Another important finding of the International Schizophrenia 
Consortium was that the schizophrenia polygenic score is not only 
higher in people with schizophrenia, it is also higher in people 
with bipolar disorder than in controls. This indicates that the 
polygenic contribution to the two disorders is substantially shared, 
a conclusion that was also reflected in earlier studies which found 
evidence for shared risk alleles at individual genes including 
ZNF804A 2 and CACNA1C. 2 Subsequent studies have compared
and combined GWAS of schizophrenia, bipolar disorder, autism 
spectrum disorders, major depressive disorder and attention- 
deficit and hyperactivity disorder. Again, evidence for shared risk 
was observed for specific risk variants, but more importantly, 
substantial overlap was found between the three adult onset 
disorders, schizophrenia, bipolar disorder and major depressive 
disorder, and a reduced, yet still significant overlap between 
schizophrenia and autism spectrum disorder. 13-15

The overall conclusions from GWAS can be summarised as 
follows. First, the common variant contribution is not only 
substantial but also highly polygenic. Second, despite a possible 
increase in the heterogeneity, larger sample size leads to more 
genome-wide significant associations each of which potentially 
offers a window into the biology of schizophrenia. Third, and in 
contrast to APOE in Alzheimer's disease, and the major histo- 
compatibility complex in some auto-immune disorders, there are 
no common variants that individually contribute substantially to 
liability. Fourth, the cross-disorder analyses reveal substantial 
genetic overlap between the adult onset disorders, and a more 
modest amount of overlap between the schizophrenia and child- 
hood onset disorders. This shared risk highlights the importance 
and potential utility of genetic findings to inform the aetiological 
and pathophysiological relationships between these syndromi- 
cally defined disorders. 

Rare genetic variation 

The GWAS design typically investigates the higher end of the 
allelic frequency spectrum (minor allele frequency > 1%) but it has
been known for some time that rare variation also plays a role in 
schizophrenia. The earliest definitive report of schizophrenia- 
associated rare variation concerned deletions at chromosome 
22q11.2. 16 This deletion CNV confers a substantial (about 25-fold)
increase in risk for schizophrenia as well as other psychiatric and 
neurodevelopmental phenotypes. 17 As GWAS technology began 
to permit genome-wide CNV scans, so evidence has accrued for a 
wider role in the disorder for CNVs. 

In general, people with schizophrenia have an increased burden 
of large (> 100 kb) rare (frequency < 1%) CNVs compared with
controls. They also have an increased frequency of de novo
CNVs. 17-19 To date, CNVs at several distinct loci have been strongly
implicated in schizophrenia, 17,20,21 and, just as for deletions at
22q11.2, the effects of these CNVs are not specific; all are
associated with at least one other neurodevelopmental and
psychiatric condition including intellectual disability, autism
spectrum disorders and attention-deficit and hyperactivity
disorder. 21-27 CNVs typically affect many genes, so cross-disorder
effects cannot be attributed with confidence to a shared risk gene 
at any given multi-genic locus. Nevertheless, the findings are at 
least suggestive of partial sharing in genetic risk across multiple 
disorders. 

Studies of CNVs, particularly of de novo CNVs, have yielded 
insights into biological processes that are perturbed in schizo- 
phrenia. CNVs preferentially disrupt genes involved in neurode- 
velopmental pathways. 28 More specifically, there is strong 
evidence that they are enriched for genes in the postsynaptic 
density that play a role in modulating synaptic strength at 
glutamatergic synapses, particularly genes encoding members of 
the N-methyl-d-aspartate receptor (NMDAR) complex and the 
activity-regulated cytoskeleton-associated (ARC) protein com- 
plex. 19 Recently, independent gene-set association analysis of 
case CNVs discovered in a genome-wide screen of cases and 
controls provided additional support for genes encoding protein 
members of the postsynaptic density 21 as well as for enrichment 
among case CNVs for calcium channel signalling genes and 
targets of the fragile X mental retardation protein (FMRP). 

In general, schizophrenia-associated CNVs have large individual 
effect sizes but are extremely rare in the population, whereas 
associated common variants have small individual effect sizes but 
are common in the population. The effect sizes of de novo point 
mutations are currently unclear. The aggregate effect of CNVs and 
SNPs, both de novo and inherited, across the allelic frequency 
spectrum must be considered in order to determine the contri- 
bution of genetic effects to schizophrenia. 

MAJOR GWAS AND EXOME SEQUENCING STUDIES OF 2014 

GWAS 

The recently published second GWAS paper from the Schizo- 
phrenia Working Group of the PGC 29 comprised an analysis of 49 
nonoverlapping samples containing 34 241 cases and 45 604 
controls as well as 1235 parent affected-offspring trios. In total, 
this more than doubled the sample sizes used in the previously 
largest GWAS. 11 

Summary data for an additional sample of 1513 cases and
66 236 controls were obtained from deCODE genetics for linkage
disequilibrium-independent associated SNPs at P < 1 × 10⁻⁶.
Meta-analysis of these datasets resulted in a total of 128 statistically 
independent schizophrenia associations in 108 distinct genomic 
loci. The 108 loci included 25 previously reported, and 83 novel, 
loci. The continued and extended support for previously reported 
loci demonstrates the reliability of the earlier GWAS results built on 
large sample sizes, a point that had been demonstrated by 
extensive and fully independent replication of loci identified in the 
first PGC schizophrenia study. 30 

Of the associated loci, most (75%) contained one or more
protein-coding genes. Perhaps most notably of all, one of the loci contains
the dopamine receptor D2 (DRD2) gene, which encodes the main
target of all effective antipsychotic drugs. This is the first strong
link between genetic susceptibility to schizophrenia and the 
mechanism of action underpinning its treatment. It also provides 
an important reminder that, despite the small effect sizes asso- 
ciated with individual risk alleles, GWAS can identify treatment 
targets, modulation of which can have profound effects on the 
disorder. 31 Other associated loci are notable for containing 
glutamate receptors (GRIA1, GRIN2A and GRM3) and members
of the voltage-gated calcium ion channel family of proteins
(CACNA1C, CACNA1I and CACNB2) as well as many genes involved
in synaptic plasticity. Together with recent findings from 
sequencing studies reviewed below, these associations provide a 
broader body of evidence that disruption of the glutamate system
and neuronal calcium homeostasis contributes to the pathophysiology
of schizophrenia. Though these genes are all plausible
biological candidates, the authors were careful to point out that
they may not necessarily be the causal elements within the 
associated loci. 

Using histone acetylation markers from various sources including
the ENCODE project 32 to define enhancer elements, the PGC group
found, reassuringly if not surprisingly, that associations were
enriched in enhancers that were relatively brain-specific.
However, they also found associations were enriched in
enhancers that are active in immune cells and tissues, even after 
removing those with prominent activity in brain as well as 
excluding the poorly localised signal in the major histocompat- 
ibility complex region. This provides support for the hypothesis 
that reports of abnormal immune function in schizophrenia may 
reflect a causal disruption in immunity in schizophrenia rather 
than an epiphenomenon. 33 However, caution is required here 
until links can be made between the associations at immune 
system enhancers and altered function in specific immune genes, 
and between changes in the functions of those genes and altered 
immune function. Nevertheless, the findings provide a further 
impetus for studying immune function in schizophrenia. 

Polygene score analysis showed the amount of variance in case-control
status explained by common additive genetic effects rising
from previous estimates to 18% as measured by Nagelkerke R².
Even so, predicting diagnostic status by polygenic score results in
a very high degree of diagnostic misclassification, indicating that
such analyses are not yet clinically useful.
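
For readers unfamiliar with the statistic, Nagelkerke R² rescales the Cox and Snell pseudo-R² of a logistic model so that its maximum is 1. The sketch below computes it from predicted case probabilities. It assumes an intercept-only null model and takes the full-model probabilities as given rather than fitting them; the labels and probabilities are hypothetical, and the function names are ours.

```python
import math

def log_likelihood(labels, probabilities):
    """Bernoulli log-likelihood of 0/1 labels given predicted case probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probabilities)
    )

def nagelkerke_r2(labels, full_probabilities):
    """Nagelkerke pseudo-R² of a model against an intercept-only null.

    Requires at least one case and one control, otherwise the null
    log-likelihood is undefined.
    """
    n = len(labels)
    case_fraction = sum(labels) / n
    ll_null = log_likelihood(labels, [case_fraction] * n)
    ll_full = log_likelihood(labels, full_probabilities)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell

# Hypothetical sample: 1 = case, 0 = control, with model-predicted probabilities
# (in practice these would come from a logistic regression on the polygenic score).
labels = [1, 1, 1, 0, 0, 0]
predicted = [0.7, 0.6, 0.6, 0.4, 0.4, 0.3]
r2 = nagelkerke_r2(labels, predicted)
```

A model that predicts no better than the case fraction gives a value of 0, so the statistic measures improvement over knowing only how common cases are in the sample.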

Although the recent study represents a step change in 
discovering genetic susceptibility loci, an important limitation of 
the study was that plausible functional variants could not be 
identified for most of the findings. Only 10 of the index 
associations could potentially (though not definitively) be ascribed 
to a nonsynonymous coding variant, and only 12 could be 
credibly explained by a known expression quantitative trait locus
(eQTL). This highlights the need for a richer resource of
annotation data if GWAS results are to be fully exploited for 
biological understanding. 

Sequencing studies 

Another recent landmark in schizophrenia genetics was the back- 
to-back publication in 2014 of two fairly large studies that used 
sequencing technology to screen most of the known coding 
exome for rare single-nucleotide variants (SNVs) and small 
insertions or deletions (indels) that might affect risk for schizo- 
phrenia. 34,35 

Prior to 2014, a series of small sequencing studies provided 
support for the hypothesis that, just as for de novo CNVs, the 
rate of de novo SNVs and indel mutations was increased in 
schizophrenia. 35-38 Others also reached similar conclusions for
autism 39-42 and, more so, intellectual disability. 43,44 Moreover,
since this type of mutation preferentially occurs in older men, 45 it
was thought de novo mutations might partially explain the 
increased risks for schizophrenia in children whose fathers are 
relatively old at the time of conception. 46 

However, in the largest study of de novo mutations in 
schizophrenia to date, no evidence was found for an overall 
increase in the rate of nonsynonymous or loss-of-function de novo 
mutations, suggesting that de novo mutations play a lesser role in 
schizophrenia than has been indicated by the earlier studies, or for 
the other disorders. 4 

Despite the lack of evidence for a general elevation in the rate 
of de novo mutations, there was significant enrichment of 
nonsynonymous de novo mutations in glutamatergic postsynaptic 
proteins comprising the ARC and NMDAR complexes. 19 De novo 
mutations were additionally enriched in other proteins that are
hypothesised to modulate synaptic strength, specifically proteins
regulating actin filament dynamics and those whose mRNAs are
targets of FMRP, a gene set also shown to be enriched in autism
spectrum disorder de novo mutations 42 and in a schizophrenia
case-control CNV study. 21

A second observation, subsequently also reported in a much 
smaller study, was that de novo mutations in schizophrenia 
occurred more frequently in genes affected by de novo mutations 
in autism spectrum disorder and intellectual disability. 47 Unlike 
CNV associations, the overlaps in de novo mutations point to 
overlapping risk at single gene rather than locus resolution, and 
thereby provide more definitive evidence for pleiotropy. 

In the study of Fromer et al., although cases in general had no 
elevation in de novo mutation rates, de novo loss-of-function 
mutations occurred more frequently than expected in people with 
schizophrenia who also had lower premorbid educational attain- 
ment, suggesting that de novo loss-of-function mutations may 
have a role in neurodevelopmental impairment across diagnostic 
boundaries. However, on average, schizophrenia de novo muta- 
tions were predicted to be less damaging to the protein function 
than those in people with autism or intellectual disability, a finding 
that is consistent with the hypothesis that schizophrenia occupies 
a less extreme position on a neurodevelopmental gradient of 
impairment than the other two disorders. 34 The hypothesis of 
shared genetic risk has obtained further support from the recent 
GWAS of the PGC, 29 in which it was noted that genome-wide
significant loci are enriched for genes affected by de novo
mutations in autism and intellectual disability. 

The second, and larger, exome sequencing study examined rare 
variants predicted to be damaging using a case-control study 
design of approximately 2 500 cases and 2 500 controls. 35 No 
specific gene showed a genome-wide corrected significant excess 
of rare mutations in cases. However, a large set of around 2 500 
genes selected a priori by the authors to be likely enriched for 
schizophrenia susceptibility genes (for example, genes with de 
novo mutations, genes in GWAS regions, mapping to CNVs, 
members of the ARC, NMDAR and postsynaptic density pathways) 
showed an increased burden of rare nonsense and disruptive 
variants in cases compared with the controls. This burden was 
attributable to mutations in a large number of genes suggesting 
that the mutational target of schizophrenia encompasses many 
hundreds of genes. Modelling suggests that the impact on disease 
risk of rare CNVs and disruptive mutations may be an order of 
magnitude smaller relative to common SNPs. 
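
The logic of a gene-set burden comparison can be illustrated with a simple carrier count. The sketch below is not the method used in the study, which is not specified here; it swaps in a one-sided Fisher's exact test on the number of individuals carrying at least one rare disruptive variant in the gene set. The carrier counts are hypothetical.

```python
from math import comb

def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact P value for an excess of carriers among cases."""
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, carriers)
    upper = min(carriers, case_total)
    # Sum the hypergeometric probabilities of tables at least as extreme
    # as the one observed (as many or more carriers among cases).
    numerator = sum(
        comb(case_total, k) * comb(control_total, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator

# Hypothetical counts of carriers of rare disruptive variants in the gene set.
p_burden = fisher_exact_greater(
    case_carriers=120, case_total=2500,
    control_carriers=80, control_total=2500,
)
```

Pooling variants across a gene set in this way trades resolution for power: an excess can be detected in aggregate even when no single gene carries enough mutations to be significant on its own.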

In terms of implications for pathophysiology, notable findings 
were convergence with the other exome sequencing study 34 in 
finding enrichment for rare damaging mutations in the ARC and 
NMDAR complexes, and in targets of FMRP. Other convergences 
were noted with GWAS data; most notably genes encoding 
voltage-gated calcium ion channel proteins were also enriched. 
However, unlike the large de novo study, 34 there was no overlap
between the mutations observed in schizophrenia and those 
discovered in either autism spectrum disorders or intellectual 
disability. 

As noted in the original publications, these studies lacked power
to implicate specific genes and rare mutations in schizophrenia.
However, they do allow us to conclude that rare, as well as
common, single-nucleotide variation plays a role in schizophrenia,
though the relative contribution of the two classes of variant
remains uncertain pending much larger sequencing studies, as
does the contribution of rare variation outside the exome. In view
of the lack of power to implicate specific genes and variants, the
authors took a hypothesis-driven, gene-set approach. Both studies
have provided support for specific gene sets that have been 
implicated in previous studies and this convergence between the 
two exome sequencing studies is striking. However, they were 
limited in that they examined only a small proportion of the
neurobiological processes and structures that could potentially be
implicated in schizophrenia. This limitation was imposed by the
desire to focus on circumscribed, well-annotated gene sets 
with strong a priori evidence for involvement in schizophrenia. 
It seems certain that other important areas of biology remain to be 
discovered but that this will require much larger samples and 
much better annotated genomic resources (see below). 

CONCLUSIONS 

The combination of new technology, extensive collaboration and 
persistence in the face of a sometimes challenging funding 
climate has been rewarding for schizophrenia genetics. Over 100 
specific common risk loci 29 and at least 11 rare risk alleles 20 have
been identified. Moreover, it is clear that these represent the tip
of an iceberg of genetic complexity. Evidence for risk effects
is seen across the entire allelic frequency spectrum, from private
de novo mutations to common SNPs. In addition, evidence from
both GWAS 29 and early sequencing studies 35 suggests that the
mutational target for schizophrenia is likely to be extensive,
involving hundreds and very likely thousands of genes.

The second finding, which is perhaps not surprising given the 
degree of genetic complexity and the continuous nature of many 
psychiatric traits, 48 is that genetic risk does not map neatly onto
psychiatric diagnoses. There is evidence for shared genetic risk
between schizophrenia, bipolar disorder, autism spectrum disorders,
intellectual disability and attention-deficit and hyperactivity
disorder. These findings point to shared disease mechanisms and to the
need for approaches to patient stratification for research that go
beyond the current Diagnostic and Statistical Manual of the
American Psychiatric Association/International Classification of
Diseases categorical approaches used in the clinic. 4 ' 4 Recent
genetic findings also support the hypothesis that schizophrenia
can be conceived of as part of a spectrum of neurodevelopmental
disorders ordered by severity with intellectual disability at one extreme
and mood disorders at the other. 50

Perhaps most encouragingly, despite the complexity of the 
genetic picture that is emerging, we are beginning to get glimpses 
of convergence onto a coherent set of biological processes. 
Results from GWAS, 11,29 CNV 19 and sequencing studies 34,35 point
to a functionally related set of synaptic proteins involved in 
synaptic plasticity, learning and memory. Among the gene sets 
that can be tied to these processes are the ARC and NMDAR 
complexes, targets of FMRP, and voltage-gated calcium channels, 
all implicated by rare variant studies, and in the case of FMRP 
targets and voltage-gated calcium channels, common variant 
studies as well, including the most recent PGC GWAS. 29 Moreover, 
although the limited gene-set analyses reported by the PGC GWAS 
did not identify enrichments among the synaptic gene sets, 
associations to loci containing individual N-methyl-d-aspartate
(GRIN2A), α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid
(GRIA1) and metabotropic glutamate (GRM3) receptor genes as
well as affiliated proteins point broadly in that direction. A more
detailed gene-set analysis of the PGC GWAS has been undertaken 
and is being prepared for publication. 

Although it seems highly likely that there are other systems and
mechanisms involved, including the dopamine system, and we 
can expect these to emerge with further and larger GWAS and 
sequencing studies, the current findings offer numerous entry 
points for neuroscientists to probe the biological basis of 
schizophrenia. Of course, the sheer genetic complexity poses
many challenges for such studies 51 and we should not forget that 
with respect to the GWAS findings, we have yet to identify the 
actual risk variants and their proximal functional consequences at 
the gene level. The relatively sparse annotation of the genome, 
regulome and proteome across multiple brain regions, specific cell 
types and developmental periods hinders the translation of 
genetic findings into mechanistic understanding. The bridging 
of this 'annotation gap' coupled with larger, well-powered genetic
studies will aid in elucidating both healthy and diseased brain
function, and potentially provide further insights into drug
discovery and nosology. However, the massive multiple testing
burden faced by genomic studies, and the proliferation of analytical
methods, mean that this endeavour will need to overcome
methodological challenges to avoid a proliferation of false positive
findings.

A further avenue for defining the effects of specific high- 
penetrance mutations will be to return to patients carrying the 
mutations for more extensive phenotyping using the plethora of 
approaches now available to clinical neuroscientists. Direct
comparison with animal and cellular models of the same mutations,
including those using induced pluripotent stem cells and new
genome editing approaches, will also likely be informative. 52,53
Detailed phenotyping studies of individual common variants
have proliferated in recent years, but these face a number of
methodological difficulties. 54 The development of methods to
measure the en masse effects of risk SNPs in individuals, such
as the polygene score approach mentioned above, offers new
approaches to studying the impact of genetic risk on brain
function using studies of unaffected as well as affected individuals.

Finally, we should note that the successes in schizophrenia genetics
are unlikely to have occurred because of fundamental
differences in the genetic architecture of this disorder compared
with many other psychiatric disorders. Although in the case of
some very-high-frequency disorders such as depression, the
genetic architecture and/or heterogeneity may be particularly
challenging, for many other disorders, similar success is likely to
follow the application of larger sample sizes.